In Templeton (2010), the Approximate Bayesian Computation (ABC)
algorithm (see, e.g., Pritchard et al., 1999, Beaumont et al., 2002,
Marjoram et al., 2003, Ratmann et al., 2009) is criticised on
mathematical and logical grounds: "the [Bayesian] inference is
mathematically incorrect and formally illogical". Since those
criticisms turn out to bear on Bayesian foundations rather than on
the computational methodology they are primarily directed at, we
endeavour to point out in this note the statistical errors and
inconsistencies in Templeton (2010), referring to Beaumont et al.
(2010) for a reply that is broader in scope since it also covers the
phylogenetic aspects of nested clade versus a model-based approach.



Coherence 

Templeton (2010) mostly uses arguments found in Templeton (2008) and
already answered in Beaumont et al. (2010). However, the tone adopted
in this PNAS paper is harsher and its scope wider than in the earlier
paper, in that it contains a foundational, if inappropriate, critical
perspective on Bayesian model comparison. All of the arguments
presented in Templeton's tribune against the ABC "method" (Tavaré et
al., 1997) actually aim at exposing the incoherence of the Bayesian
approach. The major point of contention is that Bayes factors are
mathematically incorrect because they contradict basic logic by being
incoherent. The notion of coherence used in Templeton (2010) is
borrowed from Lavine and Schervish (1999). Those authors introduced
this notion to criticise Bayes factors in the limited sense that
those may be nonmonotonic in the alternative hypothesis (in cases
when monotonicity is relevant), and thus that posterior
probabilities, which are coherent, should be used instead in a
correct decision-theoretic perspective.



Bayes factors 

The core of the Bayesian paradigm is to incorporate all aspects of
uncertainty within a prior distribution on the parameter space and
all aspects of decision consequences within a loss function in order
to produce a single inferential machine that provides the "optimal"
solution (Berger, 1985). Posterior probabilities and hence Bayes
factors (Kass and Raftery, 1995) are the product of this inferential
machine when the goal is the selection of a statistical model. We
recall that a Bayes factor, of the form

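$$B_{12}(x) = \frac{m_1(x)}{m_2(x)} = \frac{\int \pi_1(\theta_1)\, f_1(x \mid \theta_1)\, d\theta_1}{\int \pi_2(\theta_2)\, f_2(x \mid \theta_2)\, d\theta_2}$$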

compares the marginal likelihoods at the observed data x under the
two models being compared. The suggestion of Templeton (2010) to
"incorporate the sampling error of the observed statistic" therefore
exhibits a misunderstanding of the above Bayesian construction, since
the posterior distributions naturally incorporate the sampling
errors, via the densities $f_1(x \mid \theta_1)$ and
$f_2(x \mid \theta_2)$, under both models.
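
As an illustration of this definition, the following minimal sketch computes a Bayes factor in closed form for a toy example that is not taken from Templeton (2010): x successes out of n binomial trials, with model 1 fixing the success probability at p = 1/2 and model 2 putting a Uniform(0, 1) prior on p. The function names and the choice of models are assumptions made for the illustration only.

```python
from math import comb


def marginal_point(x, n, p0=0.5):
    """m_1(x): binomial likelihood at the fixed value p = p0 (nothing to integrate)."""
    return comb(n, x) * p0**x * (1 - p0) ** (n - x)


def marginal_uniform(x, n):
    """m_2(x): binomial likelihood integrated against a Uniform(0, 1) prior on p.

    The integral of C(n, x) p^x (1 - p)^(n - x) over [0, 1] equals 1 / (n + 1).
    """
    return 1.0 / (n + 1)


def bayes_factor_12(x, n):
    """B_12(x) = m_1(x) / m_2(x), the ratio of the two marginal likelihoods."""
    return marginal_point(x, n) / marginal_uniform(x, n)


# Hypothetical data: 5 successes in 10 trials.
b12 = bayes_factor_12(5, 10)
print(b12)
```

In this toy case the Bayes factor is about 2.71 in favour of the nested model p = 1/2, and the sampling distribution of x enters through the binomial likelihood in both marginals.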

Templeton's (2010) first argument against Bayes factors, namely that
"the probability of the nested special case must be less than or
equal to the probability of the general model within which the
special case is nested. Any statistic that assigns greater
probability to the special case is incoherent", proceeds from the
"natural" argument that larger models should have larger
probabilities by an encompassing analogy. (Note that the notion of
defining "the" probability over the collection of models that
Templeton seems to take for granted does not make sense outside a
Bayesian framework.) The author presents a Venn diagram to further
explain why a larger set should have a larger measure, as if this
simple-minded analogy were relevant in model choice settings. We
found similar arguments in the recent epistemological book by Sober
(2008) as well as in Popper (1959). This reductive viewpoint does not
account for the fact that, in Bayesian model choice, different models
induce different parameter spaces and that those parameter spaces are
endowed with orthogonal measures, especially if those spaces are of
different dimensions. When the smaller parameter space corresponds to
the restriction $\theta_1 = 0$, the measure of this constraint is
zero in the larger space, i.e. $P(\theta_1 = 0 \mid M_2) = 0$, when
the parameter space is continuous. As stressed by Jeffreys (1939),
testing for point null hypotheses (and hence for nested models)
requires a drastic change of dominating measure so that both the null
and the alternative hypotheses have a positive probability. This
implies defining versions of the prior distribution over both
parameter spaces. Therefore, talking of nested models having a
"smaller" probability than the encompassing model or of "partially
overlapping models" does not make sense from a measure-theoretic
(hence mathematical) perspective. In other words, the measure of the
event is conditional on the model considered. (The fifty-one
occurrences of the words coherent or incoherent in the paper do not
bring additional scientific weight to the argument.)
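
The change of dominating measure can be written out in one standard way, sketched here under simplifying assumptions that are ours: $\theta_1$ is the only parameter, $\rho_0$ denotes a prior weight on the null, $\delta_0$ is the Dirac mass at zero, and $\pi_2$ is a continuous prior density under the larger model.

```latex
% Prior mixing a point mass at 0 (null) with a continuous density (alternative)
\pi(\theta_1) = \rho_0\, \delta_0(\theta_1) + (1 - \rho_0)\, \pi_2(\theta_1),
\qquad 0 < \rho_0 < 1 .

% Posterior probability of the null, with m_2 the marginal likelihood under \pi_2
P(\theta_1 = 0 \mid x)
  = \frac{\rho_0\, f(x \mid 0)}{\rho_0\, f(x \mid 0) + (1 - \rho_0)\, m_2(x)},
\qquad
m_2(x) = \int f(x \mid \theta_1)\, \pi_2(\theta_1)\, d\theta_1 .
```

Under $\pi_2$ alone the event $\theta_1 = 0$ has probability zero, so it is only the point mass $\rho_0 \delta_0$ that gives the null a positive prior and posterior probability, and nothing forces that posterior probability to be smaller than the one of the alternative.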



Bayesian model comparison 

When Templeton (2010) appeals to logic to reject "incoherent"
probability orderings on models, he rejects the fact that marginal
likelihoods are on the same scale and can be added within the
denominator of posterior probabilities, namely



$$\Pr(M_i \mid x) = \frac{\Pi_i\, m_i(x)}{\sum_j \Pi_j\, m_j(x)},$$



using standard notations (Berger, 1985, Robert, 2001). His argument
is that the denominator is proportional to the probability of the
union of several models and hence that "the probabilities of the
intersections of the overlapping hypotheses [or models] must be
subtracted". Another Venn diagram explains why this basic consequence
of Bayes formula is "mathematically and logically incorrect" and why
marginal likelihoods cannot be added up when models "overlap".
According to Templeton, "there can be no universal denominator,
because a simple sum always violates the constraints of logic when
logically overlapping models are tested". Once more, this simply
shows a poor understanding of the probabilistic modelling involved in
model choice: the argument fails because of the measure-theoretic
assumptions separating models and because model choice ultimately
involves the selection of one single model, hence the rejection of
all other models. There cannot be a posterior weight on any
intersection for this reason.
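
The additivity of weighted marginal likelihoods in the denominator can be checked numerically. The sketch below is only an illustration under our own assumptions: two models with equal prior weights, and marginal likelihoods 252/1024 and 1/11, which are the values obtained for 5 successes in 10 binomial trials under p = 1/2 and under a Uniform(0, 1) prior on p. The function name is hypothetical.

```python
def posterior_model_probs(prior_weights, marginals):
    """Posterior model probabilities: prior weight times marginal likelihood, normalised."""
    weighted = [w * m for w, m in zip(prior_weights, marginals)]
    total = sum(weighted)  # the "universal denominator": a plain sum over models
    return [v / total for v in weighted]


# Illustrative inputs (assumed, see the lead-in): equal prior weights and
# marginal likelihoods 252/1024 (point model) and 1/11 (uniform-prior model).
probs = posterior_model_probs([0.5, 0.5], [252 / 1024, 1 / 11])
print(probs)
```

The two posterior probabilities sum to one by construction, and the nested model receives the larger one (about 0.73) without any contradiction, since each marginal likelihood is computed under its own prior measure.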

A second criticism of ABC (i.e. of the Bayesian approach) is that
model choice requires a collection of models and cannot decide
outside this finite and therefore incomplete collection. The very
purpose of a Bayesian model choice procedure is precisely to select
the most likely model among all those available, rather than to
reject a given model when the data are unlikely. Studies like Berger
and Sellke (1987) have shown the difficulty of reasoning within a
single model. Furthermore, Templeton (2010) advocates the use of a
likelihood ratio test, which necessarily implies using two models
with one nested within the other.

In this paper, Templeton also reiterates the earlier (2008) criticism
that marginal likelihoods are not comparable across models, because
they "are not adjusted for the dimensionality of the data or the
models" (sic!). This point misses the whole purpose of using marginal
likelihoods, namely that they account for the dimensionality of the
parameter by providing a natural Ockham's razor (MacKay, 2002)
penalising the larger model without requiring the specification of a
dimension penalty. Both BIC and DIC (Spiegelhalter et al., 2002) are
approximations to the exact Bayesian evidence, which shows the
intrinsic penalisation thus provided by marginal likelihoods. Note
also that ABC applies the basic principles of a Bayesian model
comparison to a summary statistic that is common across models
(Grelaud et al., 2009), rather than using model-specific summary
statistics, which would then be inconsistent.
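
The automatic penalisation can be seen in a simple setting, sketched here under assumptions of our own: a single observation x distributed as N(theta, 1), with model 1 fixing theta = 0 and model 2 using a N(0, tau2) prior on theta. Under model 2 the marginal distribution of x is N(0, 1 + tau2), so both marginal likelihoods are available in closed form. Function names are hypothetical.

```python
from math import exp, pi, sqrt


def normal_pdf(x, var):
    """Density at x of a centred normal distribution with variance var."""
    return exp(-x * x / (2.0 * var)) / sqrt(2.0 * pi * var)


def bayes_factor_null_vs_normal(x, tau2):
    """B_12 for x ~ N(theta, 1): M_1 fixes theta = 0, M_2 takes theta ~ N(0, tau2)."""
    m1 = normal_pdf(x, 1.0)         # no parameter to integrate out under M_1
    m2 = normal_pdf(x, 1.0 + tau2)  # x is marginally N(0, 1 + tau2) under M_2
    return m1 / m2


# Same observation, increasingly diffuse priors under the larger model.
factors = [bayes_factor_null_vs_normal(1.0, t2) for t2 in (1.0, 10.0, 100.0)]
print(factors)
```

For this fixed observation the Bayes factor in favour of the smaller model grows as the prior under the larger model spreads its mass more thinly, without any explicit penalty term being added.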



Implications of model criticism 

The point corresponding to the quote "ABC is used for parameter
estimation in addition to hypothesis testing and another source of
incoherence is suggested from the internal discrepancy between the
posterior probabilities generated by ABC and the parameter estimates
found by ABC" is that, while the posterior probability that
$\theta_1 = 0$ (model $M_1$) is much higher than the posterior
probability of the opposite (model $M_2$), the Bayes estimate of
$\theta_1$ under model $M_2$ is "significantly different from zero".
Again, this reflects both a misunderstanding of the probability
model, namely that $\theta_1 = 0$ is impossible [has measure zero]
under model $M_2$, and a confusion between confidence intervals (that
are model specific) and posterior probabilities (that work across
models). The concluding message that "ABC is a deeply flawed Bayesian
procedure in which ignorance overwhelms data to create massive
incoherence" is thus unsubstantiated.



ABC is only a Monte Carlo scheme 

An issue common to all recent criticisms by Templeton (2008, 2010) is
the misleading or misled confusion between the ABC method and the
resulting Bayesian inference. For instance, Templeton distinguishes
the incoherence in the ABC model choice procedure from the
incoherence in the Bayes factor, whereas ABC is used as a
computational device to approximate the Bayes factor. In the current
case, the Bayes factor can be directly derived from the ABC
simulation since the (accepted or rejected) proposed values are
simulated from $\pi(\theta) f(x \mid \theta)$ (modulo a numerical
approximation effect). This does not turn the Bayes factor into an
ABC or simulation-based quantity. There is therefore no inferential
aspect linked with ABC per se: it is simply a numerical tool to
approximate Bayesian procedures and, with enough computer power, the
approximation can get as precise as one wishes.
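
To make the purely numerical nature of ABC concrete, here is a minimal rejection sampler on a toy problem of our own choosing (binomial data with a Uniform(0, 1) prior on the success probability p, and acceptance on an exact match of the simulated count). With an exact match the accepted values are draws from the true posterior, here a Beta(x + 1, n - x + 1) distribution, so the output can be compared with a known answer. All names are hypothetical.

```python
import random


def abc_rejection(x_obs, n, n_proposals, seed=0):
    """ABC rejection sampler: keep prior draws whose simulated data match x_obs."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        p = rng.random()                                 # draw p from the Uniform(0, 1) prior
        x_sim = sum(rng.random() < p for _ in range(n))  # simulate Binomial(n, p) data
        if x_sim == x_obs:                               # accept only on an exact match
            accepted.append(p)
    return accepted


# Hypothetical data: 7 successes in 10 trials.
sample = abc_rejection(x_obs=7, n=10, n_proposals=100_000)
abc_mean = sum(sample) / len(sample)
exact_mean = (7 + 1) / (10 + 2)  # mean of the Beta(8, 4) posterior
print(abc_mean, exact_mean)
```

The simulated posterior mean approaches the exact one as the number of proposals grows, which is all that is meant by the approximation getting as precise as one wishes. In realistic problems the exact match is replaced by a tolerance on summary statistics, which adds a further approximation not shown here.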

One of the arguments in Templeton (2010) relies on the following
representation of the "ABC equation" (sic!)

$$P(H_i \mid H, S^*) = \frac{G_i(\|S_i - S^*\|)\, \Pi_i}{\sum_{j=1}^{n} G_j(\|S_j - S^*\|)\, \Pi_j}$$

where $S^*$ is the observed summary statistic, $S_i$ is "the vector
of expected (simulated) summary statistics under model i" and "$G_i$
is a goodness-of-fit measure". Templeton states that this
"fundamental equation is mathematically incorrect in every instance
(..) of overlap." This representation of the ABC approximation is
again misleading or misled in that the simulation algorithm ABC
produces an approximation to a posterior sample from
$\pi_i(\theta_i \mid S^*)$. The resulting approximation to the
marginal likelihood under model $M_i$ is a regular Monte Carlo step
that replaces an integral with a weighted sum (an average), not a
"goodness-of-fit measure", and the $S_i$'s are replicated many times.
Templeton's subsequent argument about the goodness-of-fit measures
being "not adjusted for the dimensionality of the data" (re-sic!) and
the resulting incoherence is therefore void of substance. The
following argument repeats a misunderstanding, stressed above, of the
probabilistic model involved in Bayesian model choice: the reasoning
that, if

$$\sum_i \Pi_i = 1,$$

"the constraints of logic are violated [and] the prior probabilities
used in the very first step of their Bayesian analysis are
incoherent", fails to take into account that the measures are defined
over mutually exclusive spaces.

In Templeton (2010), ABC is presented as allowing statistical
comparisons among simulated models: "ABC assigns posterior
probabilities to a finite set of simulated a priori models." The
simulation aspect is treated with suspicion and opposed to "standard
classical tests", even though the method simply replaces an
intractable integral with a converging average. Once more, there is
no statistical flaw that can be attributed to ABC since this is a
purely numerical method. The models under comparison are therefore
the same as those studied by "standard classical tests", and what is
simulated is a sample from the posterior distribution associated with
each model, not the model itself.
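
The same point can be illustrated for model choice with a sketch under assumptions of our own (a toy binomial problem, not an example from Templeton, 2010). The model index is drawn from its prior, the parameter from the prior of that model, and data are simulated and compared with the observation through a statistic common to both models. The acceptance frequencies then approximate posterior model probabilities that are also available in closed form here.

```python
import random


def abc_model_choice(x_obs, n, n_proposals, seed=1):
    """Approximate Pr(M_1 | x_obs) by the share of accepted simulations from M_1.

    M_1 fixes p = 1/2; M_2 takes p ~ Uniform(0, 1); both have prior weight 1/2.
    """
    rng = random.Random(seed)
    accepted = {1: 0, 2: 0}
    for _ in range(n_proposals):
        model = 1 if rng.random() < 0.5 else 2           # draw the model index from its prior
        p = 0.5 if model == 1 else rng.random()          # draw p from that model's prior
        x_sim = sum(rng.random() < p for _ in range(n))  # simulate Binomial(n, p) data
        if x_sim == x_obs:                               # same statistic for both models
            accepted[model] += 1
    return accepted[1] / (accepted[1] + accepted[2])


# Hypothetical data: 5 successes in 10 trials.
abc_prob_m1 = abc_model_choice(x_obs=5, n=10, n_proposals=200_000)
exact_prob_m1 = (252 / 1024) / (252 / 1024 + 1 / 11)  # closed-form posterior probability
print(abc_prob_m1, exact_prob_m1)
```

The simulation output and the closed-form value agree up to Monte Carlo error, and the two models being compared are exactly the ones a likelihood-based analysis would use. Only the integrals are replaced by averages.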